MS&E 336 Lecture 14: Approachability and regret minimization

Author

  • Ramesh Johari
Abstract

[...] $\prod_{j \neq i} A_j$. We let $a_i$ denote a pure action for player $i$, and let $s_i \in \Delta(A_i)$ denote a mixed action for player $i$. We will typically view $s_i$ as a vector in $\mathbb{R}^{A_i}$, with $s_i(a_i)$ equal to the probability that player $i$ places on $a_i$. We let $\Pi_i(a)$ denote the payoff to player $i$ when the composite pure action vector is $a$, and by an abuse of notation also let $\Pi_i(s)$ denote the expected payoff to player $i$ when the composite mixed action vector is $s$. The game is played repeatedly by the players. We let $h$ denote the history up to time $T$, that is, the sequence of composite pure action vectors played so far. The external regret of player $i$ against action $s_i$ after history $h$ is:
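
The defining expression is not reproduced in this abstract. As a minimal sketch, assuming the standard definition of external regret (the total payoff player i would have earned by playing the fixed action s_i against the other players' realized actions, minus the payoff player i actually received), the quantities above could be computed as follows. The function names, the dictionary representation of mixed actions, and the matching-pennies example are illustrative assumptions and do not come from the lecture.

```python
from itertools import product


def expected_payoff(payoff, mixed):
    """Expected payoff Pi_i(s) under a composite mixed action vector.

    payoff: function mapping a tuple of pure actions to player i's payoff.
    mixed:  list with one dict per player, mapping pure action -> probability.
    """
    total = 0.0
    for profile in product(*(s.keys() for s in mixed)):
        prob = 1.0
        for s, a in zip(mixed, profile):
            prob *= s[a]
        total += prob * payoff(profile)
    return total


def external_regret(payoff, history, i, s_i):
    """Assumed standard form: sum over t of Pi_i(s_i, a_-i^t) - Pi_i(a^t)."""
    regret = 0.0
    for a in history:
        # Keep the other players' realized pure actions and swap in s_i
        # for player i's realized action.
        mixed = [{a_j: 1.0} for a_j in a]
        mixed[i] = s_i
        regret += expected_payoff(payoff, mixed) - payoff(a)
    return regret


# Hypothetical example: matching pennies from player 0's point of view.
def pennies(profile):
    return 1.0 if profile[0] == profile[1] else -1.0


history = [("H", "T"), ("H", "T"), ("H", "H")]
regret_vs_tails = external_regret(pennies, history, 0, {"T": 1.0})
regret_vs_uniform = external_regret(pennies, history, 0, {"H": 0.5, "T": 0.5})
```

In this example player 0 earned a total of -1, while always playing "T" would have earned +1, so the regret against "T" is 2; the uniform mixed action would have earned 0 in expectation, giving a regret of 1.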


Similar resources

MS&E 336 Lecture 15: Calibration

Calibration is a concept that tries to formalize a notion of quality for forecasters. For example, suppose a weatherman predicts each day whether it will rain or be sunny. Typically, forecasters predict such events in terms of probabilities, e.g., “There is a 30% chance of rain.” Given only the outcome that day, it is impossible to judge the quality of such a forecast. However, if we c...


Robust approachability and regret minimization in games with partial monitoring

Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in the obtained reward that belongs to a set, rather than being a single vector. Using this variant we tackle the problem of approachability in games with partial monitoring and develop simple and efficie...


Response-Based Approachability and its Application to Generalized No-Regret Algorithms

Approachability theory, introduced by Blackwell (1956), provides fundamental results on repeated games with vector-valued payoffs, and has been usefully applied since in the theory of learning in games and to learning algorithms in the online adversarial setup. Given a repeated game with vector payoffs, a target set S is approachable by a certain player (the agent) if he can ensure that the ave...


Zero-Sum Games with Vector-Valued Payoffs

In this lecture we formulate and prove the celebrated approachability theorem of Blackwell, which extends von Neumann's minimax theorem to zero-sum games with vector-valued payoffs [1]. (The proof here is based on the presentation in [2]; a similar presentation was given by Foster and Vohra [3].) This theorem is powerful in its own right, but also has significant implications for regret minimiz...


Blackwell Approachability and No-Regret Learning are Equivalent

We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger: Blackwell’s result is equivalent, in a very strong sense, to the problem of regret minimizati...


Publication date: 2007